Multi-pass ASR using vocabulary expansion

نویسندگان

Katsutoshi Ohtsuki

Nobuaki Hiroshima

Shoichi Matsunaga

Yoshihiko Hayashi

چکیده

Current automatic speech recognition (ASR) systems have to limit their vocabulary size depending on available memory size, expected processing time, and available text data for building a vocabulary and a language model. Although the vocabularies of ASR systems are designed to achieve high coverage for the expected input data, it cannot be avoided that input data includes out-of-vocabulary (OOV) words. This is called the OOV problem. We propose dynamic vocabulary expansion using a conceptual base and multi-pass speech recognition using an expanded vocabulary. Relevant words to content of input speech are extracted based on a speech recognition result obtained using a reference vocabulary. An expanded vocabulary that includes fewer OOV words is built by adding the extracted words to the reference vocabulary. The second recognition process is performed using the new vocabulary. The experimental results for broadcast news speech show our method achieves a 30% reduction in OOV rate and improves speech recognition accuracy.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On-Line Language Model Biasing for Multi-Pass Automatic Speech Recognition

The language model (LM) is a critical component in statistical automatic speech recognition (ASR) systems, serving to establish a probability distribution over the hypothesis space. In typical use, the LM is trained off-line and remains static at run-time. While cache LMs, dialogue/style adaptation, and information retrieval-based biasing provide some ability for modifying the LM at run-time, t...

متن کامل

Investigation on the effects of ASR tuning on speech translation performance

In this paper we describe some of our recent investigations into ASR and SMT coupling issues from an ASR perspective. Our study was motivated by several areas: Firstly, to understand how standard ASR tuning procedures effect the SMT performance and whether it is safe to perform this tuning in isolation. Secondly, to investigate how vocabulary and segmentation mismatches between the ASR and SMT ...

متن کامل

Combining State-level and DNN-based Acoustic Matches for Efficient Spoken Term Detection in NTCIR-12 SpokenQuery&Doc-2 Task

Recently, in spoken document retrieval task such as spoken term detection (STD), there has been increasing interest in using a spoken query. In STD systems, automatic speech recognition (ASR) frontend is often employed for its reasonable accuracy and efficiency. However, out-of-vocabulary (OOV) problem at ASR stage has a great impact on the STD performance for spoken query. In this paper, we pr...

متن کامل

Dynamic language modeling for European Portuguese

This paper reports on the work done on vocabulary and language model daily adaptation for a European Portuguese broadcast news transcription system. The proposed adaptation framework takes into consideration European Portuguese language characteristics, such as its high level of inflection and complex verbal system. A multi-pass speech recognition framework using contemporary written texts avai...

متن کامل

Dual-Type Automatic Speech Recogniser Designs for Spoken Dialogue Systems

A Dual-Type automatic speech recogniser (ASR) is a multi-pass ASR system that incorporates both a speaker-independent (SI) and a speaker-dependent (SD) ASR. The purpose of this approach is to improve the robustness of spoken dialogue systems for a broader range of applications. This paper identifies feasible Dual-Type multi-pass ASR system designs that are intended to overcome limitations arisi...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2004

Multi-pass ASR using vocabulary expansion

نویسندگان

چکیده

منابع مشابه

On-Line Language Model Biasing for Multi-Pass Automatic Speech Recognition

Investigation on the effects of ASR tuning on speech translation performance

Combining State-level and DNN-based Acoustic Matches for Efficient Spoken Term Detection in NTCIR-12 SpokenQuery&Doc-2 Task

Dynamic language modeling for European Portuguese

Dual-Type Automatic Speech Recogniser Designs for Spoken Dialogue Systems

عنوان ژورنال:

اشتراک گذاری